 conversational ai agent


IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are transforming artificial intelligence, evolving into task-oriented systems capable of autonomous planning and execution. One of the primary applications of LLMs is conversational AI systems, which must navigate multi-turn dialogues, integrate domain-specific APIs, and adhere to strict policy constraints. However, evaluating these agents remains a significant challenge, as traditional methods fail to capture the complexity and variability of real-world interactions. We introduce IntellAgent, a scalable, open-source multi-agent framework designed to evaluate conversational AI systems comprehensively. IntellAgent automates the creation of diverse, synthetic benchmarks by combining policy-driven graph modeling, realistic event generation, and interactive user-agent simulations. This innovative approach provides fine-grained diagnostics, addressing the limitations of static and manually curated benchmarks with coarse-grained metrics. IntellAgent represents a paradigm shift in evaluating conversational AI. By simulating realistic, multi-policy scenarios across varying levels of complexity, IntellAgent captures the nuanced interplay of agent capabilities and policy constraints. Unlike traditional methods, it employs a graph-based policy model to represent relationships, likelihoods, and complexities of policy interactions, enabling highly detailed diagnostics. IntellAgent also identifies critical performance gaps, offering actionable insights for targeted optimization. Its modular, open-source design supports seamless integration of new domains, policies, and APIs, fostering reproducibility and community collaboration. Our findings demonstrate that IntellAgent serves as an effective framework for advancing conversational AI by addressing challenges in bridging research and deployment. The framework is available at https://github.com/plurai-ai/intellagent
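
To make the idea of a graph-based policy model more concrete, here is a minimal sketch and not IntellAgent's actual implementation or API. It assumes that policies are nodes carrying a complexity score, that edges carry a co-occurrence likelihood, and that a test scenario is drawn by a weighted walk that stops once a target complexity is reached. All policy names, scores, weights, and function names below are hypothetical.

```python
import random

# Hypothetical policy graph. Names, complexity scores, and edge weights are
# illustrative assumptions and are not taken from IntellAgent.
COMPLEXITY = {"refund": 3, "identity_check": 2, "baggage_fee": 1, "rebooking": 4}
CO_OCCURRENCE = {
    ("refund", "identity_check"): 0.9,
    ("refund", "baggage_fee"): 0.3,
    ("refund", "rebooking"): 0.6,
    ("identity_check", "baggage_fee"): 0.2,
    ("identity_check", "rebooking"): 0.7,
    ("baggage_fee", "rebooking"): 0.5,
}


def edge_weight(a, b):
    """Return the (symmetric) likelihood that policies a and b co-occur."""
    return CO_OCCURRENCE.get((a, b)) or CO_OCCURRENCE.get((b, a)) or 0.0


def sample_policies(target_complexity, rng):
    """Walk the graph, adding policies until the summed complexity reaches the target."""
    current = rng.choice(sorted(COMPLEXITY))
    chosen = [current]
    total = COMPLEXITY[current]
    while total < target_complexity:
        candidates = [p for p in COMPLEXITY if p not in chosen]
        weights = [edge_weight(current, p) for p in candidates]
        if not candidates or sum(weights) == 0:
            break  # nothing left to add, or no connected policy
        # The next policy is drawn in proportion to its edge weight from the current one.
        current = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(current)
        total += COMPLEXITY[current]
    return chosen, total


if __name__ == "__main__":
    policies, complexity = sample_policies(5, random.Random(0))
    print(policies, complexity)
```

Under these assumptions, raising `target_complexity` yields scenarios that combine more policies, which is one way to vary difficulty across a synthetic benchmark.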


iFetch Talking Series

#artificialintelligence

This week I begin a series of talks about the iFetch project. Our goal with iFetch is to create a trustworthy multimodal conversational agent for the online fashion marketplace. How do you move beyond question-answering chatbots? How can multimodality be used to address customers' varied intents? What is the best policy for meeting our customers' goals while responding in the right tone?


Stefano Somenzi, Athics: On no-code AI and deploying conversational bots

#artificialintelligence

No-code AI solutions are helping more businesses to get started on their AI journeys than ever. AI News caught up with Stefano Somenzi, CTO at Athics, to get his thoughts on no-code AI and the development of virtual agents. AI News: Do you think "no-code" will help more businesses to begin their AI journeys? Stefano Somenzi: The real advantage of "no code" is not just the reduced effort required for businesses to get things done, it is also centered around changing the role of the user who will build the AI solution. "No code" means that the AI solution is built not by a data scientist but by the process owner.


The Latest Breakthroughs in Conversational AI Agents

#artificialintelligence

First, Google's chatbot Meena and Facebook's chatbot Blender demonstrated that dialog agents can achieve close to human-level performance in certain tasks. Then, OpenAI's GPT-3 model made lots of people wonder whether Artificial General Intelligence (AGI) is already here. While we are still a long way from true AGI, conversations with GPT-3 based chatbots can be very entertaining. Are you interested in learning more about the latest research breakthroughs in Conversational AI? Check out our premium research summaries covering open-domain chatbots, task-oriented chatbots, dialog datasets, and evaluation metrics. Subscribe to our AI Research mailing list at the bottom of this article to be alerted when we release new summaries.


Personalized Query Rewriting in Conversational AI Agents

arXiv.org Artificial Intelligence

Spoken language understanding (SLU) systems in conversational AI agents often experience errors in the form of misrecognitions by automatic speech recognition (ASR) or semantic gaps in natural language understanding (NLU). These errors easily translate into user frustration, particularly in recurrent events such as regularly toggling an appliance or calling a frequent contact. In this work, we propose a query rewriting approach that leverages users' historically successful interactions as a form of memory. We present a neural retrieval model and a pointer-generator network with hierarchical attention and show that they perform significantly better at the query rewriting task with the aforementioned user memories than without. We also highlight how our approach with the proposed models leverages the structural and semantic diversity in ASR's output towards recovering users' intents.
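
As a rough illustration of rewriting a query from a user's memory of successful interactions, here is a minimal sketch. It is not the paper's neural retrieval model or pointer-generator network: it swaps in plain token-overlap (Jaccard) similarity as a stand-in for retrieval. The function names, the threshold value, and the example utterances are all assumptions made for illustration.

```python
def tokens(text):
    """Lowercase a query and split it into a set of whitespace-separated tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Token-overlap similarity between two queries, in the range 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def rewrite(query, memory, threshold=0.5):
    """Return the most similar past successful query, or the input if none is close enough.

    `memory` is a list of queries that previously succeeded for this user.
    The threshold is an arbitrary illustrative choice, not a tuned value.
    """
    best = max(memory, key=lambda past: jaccard(query, past), default=None)
    if best is not None and jaccard(query, best) >= threshold:
        return best
    return query


if __name__ == "__main__":
    user_memory = ["turn on the kitchen light", "call mom mobile"]
    # A hypothetical ASR misrecognition of "kitchen" as "kitten".
    print(rewrite("turn on the kitten light", user_memory))
```

A learned retriever would replace `jaccard` with a similarity over learned representations. The fallback to the original query when nothing in memory is close enough is a design choice of this sketch.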


Migratable AI: Investigating users' affect on identity and information migration of a conversational AI agent

arXiv.org Artificial Intelligence

Conversational AI agents are becoming ubiquitous and assist us in our everyday activities. In recent years, researchers have explored the migration of these agents across different embodiments in order to maintain the continuity of the task and improve user experience. In this paper, we investigate users' affective responses in different configurations of the migration parameters. We present a 2x2 between-subjects study in a task-based scenario using information migration and identity migration as parameters. We outline the affect processing pipeline applied to the video footage collected during the study and report users' responses in each condition. Our results show that users reported the highest joy and were most surprised when both the information and the identity were migrated, and reported the most anger when the information was migrated without the identity of their agent.


Uber Has Been Quietly Assembling One of the Most Impressive Open Source Deep Learning Stacks in…

#artificialintelligence

Artificial intelligence (AI) has been an atypical technology trend. In a traditional technology cycle, innovation typically begins with startups trying to disrupt industry incumbents. In the case of AI, most of the innovation has come from the big corporate labs of companies like Google, Facebook, Uber or Microsoft. Those companies are not only pursuing impressive lines of research but also regularly open sourcing new frameworks and tools that streamline the adoption of AI technologies. In that context, Uber has emerged as one of the most active contributors to open source AI technologies in the current ecosystem.


PolyAI scores $12M Series A to put its 'conversational AI agents' in contact centres

#artificialintelligence

PolyAI, a London startup founded by experts in the field of "conversational AI" -- including CEO Nikola Mrkšić, who was previously the first engineer at Apple-acquired VocalIQ -- has raised $12 million in Series A funding to deploy its tech in customer support contact centres. The round was led by Point72 Ventures, with participation from Sands Capital Ventures, Amadeus Capital Partners, Passion Capital and Entrepreneur First (EF). PolyAI's founders are graduates of EF, although they didn't meet during the company building program but already knew each other from their time at Cambridge's Dialog Systems Group, part of the Machine Intelligence Lab at the University of Cambridge. "We started PolyAI in 2017, straight after submitting our PhD theses," Mrkšić tells me. "At Cambridge, we developed state-of-the-art conversational technology, and starting a company was the best way to get this tech used in the real world. We brought many of our Cambridge colleagues with us and started building the commercial version of our conversational platform."